Search CORE

142 research outputs found

The Tree Inclusion Problem: In Linear Space and Faster

Author: Alstrup S.
Bender M. A.
Cole R.
Demaine E. D.
Ferragina P.
Inge Li Gortz
Muthukrishnan S.
Philip Bille
Schlieder T.
Termier A.
Yang L. H.
Zezula P.
Publication date: 01/01/2011

Given two rooted, ordered, and labeled trees $P$ and $T$, the tree inclusion problem is to determine if $P$ can be obtained from $T$ by deleting nodes in $T$. This problem has recently been recognized as an important query primitive in XML databases. Kilpeläinen and Mannila [SIAM J. Comput. 1995] presented the first polynomial time algorithm using quadratic time and space. Since then several improved results have been obtained for special cases when $P$ and $T$ have a small number of leaves or small depth. However, in the worst case these algorithms still use quadratic time and space. Let $n_S$, $l_S$, and $d_S$ denote the number of nodes, the number of leaves, and the maximum depth of a tree $S \in \{P, T\}$. In this paper we show that the tree inclusion problem can be solved in space $O(n_T)$ and time $O(\min(l_P n_T,\ l_P l_T \log\log n_T + n_T,\ \frac{n_P n_T}{\log n_T} + n_T \log n_T))$. This improves or matches the best known time complexities while using only linear space instead of quadratic. This is particularly important in practical applications, such as XML databases, where the space is likely to be a bottleneck.

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

Controlled non uniform random generation of decomposable structures

Author: A. Denise
Berghen
Bertoni
Bostan
Brlek
Denise
Dershowitz
Drmota
Duchon
Dutour
Faugère
Flajolet
Fontana
Goldwurm
Greene
Hofacker
Jin
Lipshitz
M. Termier
Mathews
Nebel
Nicodème
Nijenhuis
Ponty
Salvy
Schönhage
van der Hoeven
Vauchaussade de Chaumont
Waterman
Y. Ponty
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

Consider a class of decomposable combinatorial structures, using different types of atoms $\Atoms = \{\At_1,\ldots,\At_{|\Atoms|}\}$. We address the random generation of such structures with respect to a size $n$ and a targeted distribution in $k$ of its distinguished atoms. We consider two variations on this problem. In the first alternative, the targeted distribution is given by $k$ real numbers $\TargFreq_1, \ldots, \TargFreq_k$ such that $0 < \TargFreq_i < 1$ for all $i$ and $\TargFreq_1+\cdots+\TargFreq_k \leq 1$. We aim to generate random structures among the whole set of structures of a given size $n$, in such a way that the expected frequency of any distinguished atom $\At_i$ equals $\TargFreq_i$. We address this problem by weighting the atoms with a $k$-tuple $\Weights$ of real-valued weights, inducing a weighted distribution over the set of structures of size $n$. We first adapt the classical recursive random generation scheme into an algorithm taking $O(n^{1+o(1)} + mn\log n)$ arithmetic operations to draw $m$ structures from the $\Weights$-weighted distribution. Secondly, we address the analytical computation of weights such that the targeted frequencies are achieved asymptotically, i.e. for large values of $n$. We derive systems of functional equations whose resolution gives an explicit relationship between $\Weights$ and $\TargFreq_1, \ldots, \TargFreq_k$. Lastly, we give an algorithm in $O(k n^4)$ for the inverse problem, i.e. computing the frequencies associated with a given $k$-tuple $\Weights$ of weights, and an optimized version in $O(k n^2)$ in the case of context-free languages. This allows for a heuristic resolution of the weights/frequencies relationship suitable for complex specifications. In the second alternative, the targeted distribution is given by $k$ natural numbers $n_1, \ldots, n_k$ such that $n_1+\cdots+n_k+r=n$, where $r \geq 0$ is the number of undistinguished atoms. The structures must be generated uniformly among the set of structures of size $n$ that contain exactly $n_i$ atoms $\At_i$ ($1 \leq i \leq k$). We give an $O(r^2\prod_{i=1}^k n_i^2 + m n k \log n)$ algorithm for generating $m$ structures, which simplifies into an $O(r\prod_{i=1}^k n_i + m n)$ algorithm for regular specifications.
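
The idea of a weighted recursive generation scheme can be illustrated on a toy case. The sketch below is an assumption-laden illustration, not the algorithm of the paper: it picks a small regular class (words over {a, b} with no two adjacent b's), gives each letter a weight, precomputes the total weight of all valid completions of each length, and then draws a word letter by letter with probability proportional to the weight of the completions that remain. The function names `weighted_counts` and `draw_word` and the choice of class are hypothetical.

```python
import random

def weighted_counts(n, w_a, w_b):
    """count[k][s] = total weight of valid suffixes of length k,
    where s = 1 if the previous letter was 'b' and s = 0 otherwise."""
    count = [[1.0, 1.0]]                          # the empty suffix has weight 1
    for k in range(1, n + 1):
        prev = count[k - 1]
        after_other = w_a * prev[0] + w_b * prev[1]   # 'a' or 'b' may come next
        after_b = w_a * prev[0]                       # only 'a' may follow a 'b'
        count.append([after_other, after_b])
    return count

def draw_word(n, w_a, w_b, rng=random):
    """Draw a word of length n with probability proportional to the
    product of its letter weights (assumes w_a > 0 and w_b >= 0)."""
    count = weighted_counts(n, w_a, w_b)
    word, state = [], 0
    for k in range(n, 0, -1):
        # Probability that the next letter is 'a', given the current state.
        p_a = w_a * count[k - 1][0] / count[k][state]
        if state == 1 or rng.random() < p_a:
            word.append("a")
            state = 0
        else:
            word.append("b")
            state = 1
    return "".join(word)

# Raising w_b makes 'b' more frequent on average without changing the class.
print(draw_word(12, w_a=1.0, w_b=3.0))
```

With both weights set to 1 the table simply counts the words of the class, and the draw is uniform; other weights bias the expected letter frequencies, which is the effect the abstract tunes analytically.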

arXiv.org e-Print Archive

HAL-CentraleSupelec

Elsevier - Publisher Connector

Crossref

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

Tree model guided candidate generation for mining frequent subtrees from XML

Author: Abe K.
Agrawal R.
Chi Y.
Elizabeth Chang
Fedja Hadzic
Feng L.
Ghoting A.
Henry Tan
Ling Feng
Nijssen S.
Sidhu A. S.
Suciu D.
Tan H.
Termier A.
Tharam S. Dillon
Wang C.
Xiao Y.
Yan X.
Yang L. H.
Zhang J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008

Due to the inherent flexibilities in both structure and semantics, XML association rules mining faces a few challenges, such as a more complicated hierarchical data structure and ordered data context. Mining frequent patterns from XML documents can be recast as mining frequent tree structures from a database of XML documents. In this study, we model a database of XML documents as a database of rooted labeled ordered subtrees. In particular, we are mainly concerned with mining frequent induced and embedded ordered subtrees. Our main contributions are as follows. We describe our unique embedding list representation of the tree structure, which enables efficient implementation of our Tree Model Guided (TMG) candidate generation. TMG is an optimal, non-redundant enumeration strategy which enumerates all the valid candidates that conform to the structural aspects of the data. We show through a mathematical model and experiments that TMG has better complexity compared to the commonly used join approach. In this paper, we propose two algorithms, MB3-Miner and iMB3-Miner. MB3-Miner mines embedded subtrees. iMB3-Miner mines induced and/or embedded subtrees by using the maximum level of embedding constraint. Our experiments with both synthetic and real datasets against two well-known algorithms for mining induced and embedded subtrees demonstrate the effectiveness and the efficiency of the proposed techniques.

Crossref

espace@Curtin

New data on the morphology of Sphenothallus Hall: implications for its affinities

Author: Babcock L. E.
Bischoff G. C. O.
Bodenbender B. E.
Boucek B.
Brood K.
Choi D. K.
Clarke J. M.
Cox R. S.
Fauchald K.
Feldmann R. M.
Howell B. F.
Iten H.
Jones M. L.
Kiderlen H.
Kozlowski R.
Mason C.
Muller K. J.
Price W. A.
Ruedemann R. H.
Salvini-Plawen L.
Schmidt W.
Termier H.
Werner B.
Wilson R. B.
Publication venue: 'Wiley'
Publication date: 01/04/1992

Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/73676/1/j.1502-3931.1992.tb01378.x.pd

Crossref

Deep Blue Documents at the University of Michigan

The origin and dispersion of human parasitic diseases in the Old World (Africa, Europe and Madagascar)

Author: Ashford DW
Benallegue A
Bernard J
Bettencourt A
Blanc F
Blanc M
Bray RS
Brygoo ER
Chaline J
Clarke R
Cornevin R
Coudert J
Coulanges P
Dedet JP
Deschamp H
Dorst J
Euzéby J
Fraga de Azevedo J
Gaillard H
Hours F
Jean-Pierre Nozais
Killick Kendrick R
Leakey RE
Leblancq SM
Lombard M
Mandahl-Barth G
Moreno G
Moyroud J
Nougier LR
Petit-Maire N
Petter F
Petter JJ
Rage JC
Renault F
Ribot JJ
Rioux JA
Taquet P
Termier G
Wendorf F
Publication venue: 'FapUNIFESP (SciELO)'

Crossref

Mining frequent closed rooted trees

Author: A. Termier
Albert Bifet
Antoni Lozano
B. Ganter
D. E. Knuth
D. Shasha
G. Valiente
J. Hein
J. M. Plotkin
José L. Balcázar
K. Hashimoto
M. J. Zaki
R. Kohavi
S. Chakrabarti
S. Weiss
T. Beyer
X. Yan
Y. Chi
Y. Xiao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Automatic congestion detection in MPSoC programs using data mining on simulation traces

Author: Lagraa S.
Pétrot Frédéric
Termier A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/10/2012

ISBN: 978-1-4673-2786-2
International audience
The efficient deployment of parallel software, especially legacy software, on multiprocessor systems on chip (MPSoC) is a challenging task. In this paper, we introduce the use of a data-mining approach on traces of a functionally correct program to automatically identify recurring congestion points and their sources. Each memory transaction occurring in the system (instruction fetch, data load, or data store) is logged using a virtual platform of the system. The resulting trace is analyzed to discover memory access patterns that occur frequently and feature high latencies. These patterns are sorted by decreasing occurrence and estimated congestion level, allowing easy identification of the sources of inefficiency. We have simulated an MPSoC with 16 processors running multiple applications, and by analyzing gigabytes of traces with this technique we have been able to automatically detect congestion on resources and its sources in the parallel program.

Hal - Université Grenoble Alpes

Raising the Dead; Extending Evolutionary Algorithms with a Case-based Memory

Author: A. Termier
J. Eggermont
S. Poyhonen
T. Lenaerts
Publication date: 01/01/2001

In dynamically changing environments, the performance of a standard evolutionary algorithm deteriorates. This is due to the fact that the population, which is considered to contain the history of the evolutionary process, does not contain enough information to allow the algorithm to react adequately to changes in the fitness landscape. Therefore, we added a simple, global case-based memory to the process to keep track of interesting historical events. Through the introduction of this memory and a storing and replacement scheme we were able to improve the reaction capabilities of an evolutionary algorithm with a periodically changing fitness function

CiteSeerX

Crossref

DI-fusion

GenRGenS: Software for Generating Random Genomic Sequences and Structures

Author: Alain Denise
Michel Termier
Yann Ponty

Summary: GenRGenS is a software tool dedicated to randomly generating genomic sequences and structures. It handles several classes of models useful for sequence analysis, such as Markov chains, Hidden Markov models, weighted context-free grammars, regular expressions and PROSITE expressions. GenRGenS is the only program that can handle weighted context-free grammars, thus allowing the user to model and generate structured objects (such as RNA secondary structures) of any given desired size. GenRGenS also allows the user to combine several of these different models at the same time. Availability: Source and executable files of GenRGenS (in Java) and the complete user’s manual are freely available a

CiteSeerX